Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss

نویسندگان

  • Yuchen Zhang
  • Lin Xiao
چکیده

We consider distributed convex optimization problems originated from sample average approximation of stochastic optimization, or empirical risk minimization in machine learning. We assume that each machine in the distributed computing system has access to a local empirical loss function, constructed with i.i.d. data sampled from a common distribution. We propose a communication-efficient distributed algorithm to minimize the overall empirical loss, which is the average of the local empirical losses. The algorithm is based on an inexact damped Newton method, where the inexact Newton steps are computed by a distributed preconditioned conjugate gradient method. We analyze its iteration complexity and communication efficiency for minimizing self-concordant empirical loss functions, and discuss the results for distributed ridge regression, logistic regression and binary classification with a smoothed hinge loss. In a standard setting for supervised learning, the required number of communication rounds of the algorithm does not increase with the sample size, and only grows slowly with the number of machines.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

DiSCO: Distributed Optimization for Self-Concordant Empirical Loss

We propose a new distributed algorithm for empirical risk minimization in machine learning. The algorithm is based on an inexact damped Newton method, where the inexact Newton steps are computed by a distributed preconditioned conjugate gradient method. We analyze its iteration complexity and communication efficiency for minimizing self-concordant empirical loss functions, and discuss the resul...

متن کامل

Distributed Inexact Damped Newton Method: Data Partitioning and Load-Balancing

In this paper we study inexact dumped Newton method implemented in a distributed environment. We start with an original DiSCO algorithm [Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss, Yuchen Zhang and Lin Xiao, 2015]. We will show that this algorithm may not scale well and propose an algorithmic modifications which will lead to less communications, better lo...

متن کامل

Primal-dual path-following algorithms for circular programming

Circular programming problems are a new class of convex optimization problems that include second-order cone programming problems as a special case. Alizadeh and Goldfarb [Math. Program. Ser. A 95 (2003) 3-51] introduced primal-dual path-following algorithms for solving second-order cone programming problems. In this paper, we generalize their work by using the machinery of Euclidean Jordan alg...

متن کامل

Partitioning Data on Features or Samples in Communication-Efficient Distributed Optimization?

In this paper we study the effect of the way that the data is partitioned in distributed optimization. The original DiSCO algorithm [Communication-Efficient Distributed Optimization of SelfConcordant Empirical Loss, Yuchen Zhang and Lin Xiao, 2015] partitions the input data based on samples. We describe how the original algorithm has to be modified to allow partitioning on features and show its...

متن کامل

Local Self-concordance of Barrier Functions Based on Kernel-functions

 Many efficient interior-point methods (IPMs) are based on the use of a self-concordant barrier function for the domain of the problem that has to be solved. Recently, a wide class of new barrier functions has been introduced in which the functions are not self-concordant, but despite this fact give rise to efficient IPMs. Here, we introduce the notion of locally self-concordant barrier functio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1501.00263  شماره 

صفحات  -

تاریخ انتشار 2015